 underwater robot


USIM and U0: A Vision-Language-Action Dataset and Model for General Underwater Robots

Gu, Junwen, Wu, Zhiheng, Si, Pengxuan, Qiu, Shuang, Feng, Yukai, Sun, Luoyang, Luo, Laien, Yu, Lianyi, Wang, Jian, Wu, Zhengxing

arXiv.org Artificial Intelligence

Underwater environments present unique challenges for robotic operation, including complex hydrodynamics, limited visibility, and constrained communication. Although data-driven approaches have advanced embodied intelligence in terrestrial robots and enabled task-specific autonomous underwater robots, developing underwater intelligence capable of autonomously performing multiple tasks remains highly challenging, as large-scale, high-quality underwater datasets are still scarce. To address these limitations, we introduce USIM, a simulation-based multi-task Vision-Language-Action (VLA) dataset for underwater robots. USIM comprises over 561K frames from 1,852 trajectories, totaling approximately 15.6 hours of BlueROV2 interactions across 20 tasks in 9 diverse scenarios, ranging from visual navigation to mobile manipulation. Building upon this dataset, we propose U0, a VLA model for general underwater robots, which integrates binocular vision and other sensor modalities through multimodal fusion, and further incorporates a convolution-attention-based perception focus enhancement module (CAP) to improve spatial understanding and mobile manipulation. Across tasks such as inspection, obstacle avoidance, scanning, and dynamic tracking, the framework achieves a success rate of 80%, while in challenging mobile manipulation tasks, it reduces the distance to the target by 21.2% compared with baseline methods, demonstrating its effectiveness. USIM and U0 show that VLA models can be effectively applied to underwater robotic applications, providing a foundation for scalable dataset construction, improved task autonomy, and the practical realization of intelligent general underwater robots.
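As a rough sanity check on the dataset figures quoted above, the following sketch derives the implied average frame rate and trajectory length. It treats "over 561K frames" as exactly 561,000 frames, which is an assumption, so the results are approximate; the function name `dataset_stats` is hypothetical and not part of USIM.

```python
# Back-of-the-envelope check of the USIM figures quoted in the abstract.
# Assumption: "over 561K frames" is taken as exactly 561_000 frames.
FRAMES = 561_000
TRAJECTORIES = 1_852
HOURS = 15.6


def dataset_stats(frames, trajectories, hours):
    """Return average frame rate and per-trajectory length (hypothetical helper)."""
    seconds = hours * 3600
    return {
        "frames_per_second": frames / seconds,
        "frames_per_trajectory": frames / trajectories,
        "seconds_per_trajectory": seconds / trajectories,
    }


stats = dataset_stats(FRAMES, TRAJECTORIES, HOURS)
for name, value in stats.items():
    print(f"{name}: {value:.1f}")
```

Under these assumptions the figures work out to roughly 10 frames per second and about 30 seconds (around 300 frames) per trajectory.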


Ariel Explores: Vision-based underwater exploration and inspection via generalist drone-level autonomy

Singh, Mohit, Dharmadhikari, Mihir, Alexis, Kostas

arXiv.org Artificial Intelligence

This work presents a vision-based underwater exploration and inspection autonomy solution integrated into Ariel, a custom vision-driven underwater robot. Ariel carries a five-camera and IMU-based sensing suite, enabling a refraction-aware multi-camera visual-inertial state estimation method aided by a learning-based proprioceptive robot velocity prediction method that enhances robustness against visual degradation. Furthermore, our previously developed and extensively field-verified autonomous exploration and general visual inspection solution is integrated on Ariel, providing aerial drone-level autonomy underwater. The proposed system is field-tested in a submarine dry dock in Trondheim under challenging visual conditions. The field demonstration shows the robustness of the state estimation solution and the generalizability of the path planning techniques across robot embodiments.


PierGuard: A Planning Framework for Underwater Robotic Inspection of Coastal Piers

Wang, Pengyu, Lin, Hin Wang, Li, Jialu, Wang, Jiankun, Shi, Ling, Meng, Max Q.-H.

arXiv.org Artificial Intelligence

Using underwater robots instead of humans for the inspection of coastal piers can enhance efficiency while reducing risks. A key challenge in performing these tasks lies in achieving efficient and rapid path planning within complex environments. Sampling-based path planning methods, such as Rapidly-exploring Random Tree* (RRT*), have demonstrated notable performance in high-dimensional spaces. In recent years, researchers have begun designing various geometry-inspired heuristics and neural network-driven heuristics to further enhance the effectiveness of RRT*. However, the performance of these general path planning methods still requires improvement when applied to highly cluttered underwater environments. In this paper, we propose PierGuard, which combines the strengths of bidirectional search and neural network-driven heuristic regions. We design a specialized neural network to generate high-quality heuristic regions in cluttered maps, thereby improving the performance of the path planning. Through extensive simulation and real-world ocean field experiments, we demonstrate the effectiveness and efficiency of our proposed method compared with previous research. Our method achieves approximately 2.6 times the performance of the state-of-the-art geometric-based sampling method and nearly 4.9 times that of the state-of-the-art learning-based sampling method. Our results provide valuable insights for the automation of pier inspection and the enhancement of maritime safety. The updated experimental video is available in the supplementary materials.
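For readers unfamiliar with the sampling-based planners this abstract builds on, here is a minimal sketch of a plain RRT in a 2D world with circular obstacles. It is deliberately not RRT* (there is no rewiring step), not bidirectional, and not PierGuard's neural heuristic; the map, parameters, and function names are all illustrative assumptions.

```python
import math
import random


def collision_free(p, q, obstacles, step=0.05):
    """Sample points along segment p->q and test them against circular obstacles."""
    n = max(1, int(math.dist(p, q) / step))
    for i in range(n + 1):
        t = i / n
        x = p[0] + t * (q[0] - p[0])
        y = p[1] + t * (q[1] - p[1])
        if any(math.hypot(x - cx, y - cy) <= r for cx, cy, r in obstacles):
            return False
    return True


def rrt(start, goal, obstacles, bounds=(0.0, 10.0), step_size=0.5,
        goal_tolerance=0.5, goal_bias=0.1, max_iters=5000, seed=0):
    """Plain RRT: grow a tree from start until a node can connect to goal."""
    rng = random.Random(seed)
    nodes = [start]
    parent = {0: None}
    for _ in range(max_iters):
        # Sample a random point, occasionally biased toward the goal.
        if rng.random() < goal_bias:
            sample = goal
        else:
            sample = (rng.uniform(*bounds), rng.uniform(*bounds))
        # Find the nearest tree node and step toward the sample.
        i_near = min(range(len(nodes)), key=lambda i: math.dist(nodes[i], sample))
        near = nodes[i_near]
        d = math.dist(near, sample)
        if d == 0.0:
            continue
        scale = min(step_size, d) / d
        new = (near[0] + scale * (sample[0] - near[0]),
               near[1] + scale * (sample[1] - near[1]))
        if not collision_free(near, new, obstacles):
            continue
        nodes.append(new)
        parent[len(nodes) - 1] = i_near
        # Stop once the new node can be joined to the goal directly.
        if math.dist(new, goal) <= goal_tolerance and collision_free(new, goal, obstacles):
            path = [] if new == goal else [goal]
            i = len(nodes) - 1
            while i is not None:
                path.append(nodes[i])
                i = parent[i]
            return path[::-1]
    return None


# Illustrative map: two circular obstacles given as (center_x, center_y, radius).
obstacles = [(5.0, 5.0, 1.5), (3.0, 7.0, 1.0)]
path = rrt((1.0, 1.0), (9.0, 9.0), obstacles)
print("waypoints:", None if path is None else len(path))
```

Heuristic variants such as those discussed in the abstract change where samples are drawn; in this sketch that corresponds to replacing the uniform `rng.uniform` sampling with a more informed distribution.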


AI-Enhanced Automatic Design of Efficient Underwater Gliders

Chen, Peter Yichen, Ma, Pingchuan, Hagemann, Niklas, Romanishin, John, Wang, Wei, Rus, Daniela, Matusik, Wojciech

arXiv.org Artificial Intelligence

The development of novel autonomous underwater gliders has been hindered by limited shape diversity, primarily due to the reliance on traditional design tools that depend heavily on manual trial and error. Building an automated design framework is challenging due to the complexities of representing glider shapes and the high computational costs associated with modeling complex solid-fluid interactions. In this work, we introduce an AI-enhanced automated computational framework designed to overcome these limitations by enabling the creation of underwater robots with non-trivial hull shapes. Our approach involves an algorithm that co-optimizes both shape and control signals, utilizing a reduced-order geometry representation and a differentiable neural-network-based fluid surrogate model. This end-to-end design workflow facilitates rapid iteration and evaluation of hydrodynamic performance, leading to the discovery of optimal and complex hull shapes across various control settings. We validate our method through wind tunnel experiments and swimming pool gliding tests, demonstrating that our computationally designed gliders surpass manually designed counterparts in terms of energy efficiency. By addressing challenges in efficient shape representation and neural fluid surrogate models, our work paves the way for the development of highly efficient underwater gliders, with implications for long-range ocean exploration and environmental monitoring.


Bacteria-inspired robot uses 12 spinning flagella to roam underwater

New Scientist

An underwater robot can delicately propel itself in any direction with its 12 flexible arms, inspired by the flagella of bacteria. Its creators claim it can carry out underwater inspections without endangering humans or wildlife, as propeller-driven robots would. Flagella are tiny, hair-like protrusions found on many bacteria that can spin clockwise or counterclockwise to create propulsion. "[Bacteria] have something called a biological motor, which rotates this elongated structure, and this elongated structure produces thrust, and that's how bacteria is propelled," says Anup Teejo Mathew at Khalifa University in Abu Dhabi,…


Cross-platform Learning-based Fault Tolerant Surfacing Controller for Underwater Robots

Hamamatsu, Yuya, Remmas, Walid, Rebane, Jaan, Kruusmaa, Maarja, Ristolainen, Asko

arXiv.org Artificial Intelligence

In this paper, we propose a novel cross-platform fault-tolerant surfacing controller for underwater robots, based on reinforcement learning (RL). Unlike conventional approaches, which require explicit identification of malfunctioning actuators, our method allows the robot to surface using only the remaining operational actuators without needing to pinpoint the failures. The proposed controller learns a robust policy capable of handling diverse failure scenarios across different actuator configurations. Moreover, we introduce a transfer learning mechanism that shares a part of the control policy across various underwater robots with different actuators, thus improving learning efficiency and generalization across platforms. To validate our approach, we conduct simulations on three different types of underwater robots: a hovering-type AUV, a torpedo-shaped AUV, and a turtle-shaped robot (U-CAT). Additionally, real-world experiments are performed, successfully transferring the learned policy from simulation to a physical U-CAT in a controlled environment. Our RL-based controller demonstrates superior stability and success rate compared to a baseline controller, achieving an 85.7 percent success rate in real-world tests versus 57.1 percent for the baseline. This research provides a scalable and efficient solution for fault-tolerant control for diverse underwater platforms, with potential applications in real-world aquatic missions.


Underwater Soft Fin Flapping Motion with Deep Neural Network Based Surrogate Model

Hamamatsu, Yuya, Kupyn, Pavlo, Gkliva, Roza, Ristolainen, Asko, Kruusmaa, Maarja

arXiv.org Artificial Intelligence

This study presents a novel framework for precise force control of fin-actuated underwater robots by integrating a deep neural network (DNN)-based surrogate model with reinforcement learning (RL). To address the complex interactions with the underwater environment and the high experimental costs, a DNN surrogate model acts as a simulator, enabling efficient training of the RL agent. Additionally, grid-switching control is applied to select optimized models for specific force reference ranges, improving control accuracy and stability. Experimental results show that the RL agent, trained in the surrogate simulation, generates complex thrust motions and achieves precise control of a real soft fin actuator. This approach provides an efficient control solution for fin-actuated robots in challenging underwater environments.


Harnessing the Power of Vibration Motors to Develop Miniature Untethered Robotic Fishes

Jiang, Chongjie, Dai, Yingying, Le, Jinyang, Chen, Xiaomeng, Xie, Yu, Zhou, Wei, Niu, Fuzhou, Li, Ying, Luo, Tao

arXiv.org Artificial Intelligence

Miniature underwater robots play a crucial role in the exploration and development of marine resources, particularly in confined spaces and high-pressure deep-sea environments. This study presents the design, optimization, and performance of a miniature robotic fish, powered by the oscillation of bio-inspired fins. These fins feature a rigid-flexible hybrid structure and use an eccentric rotating mass (ERM) vibration motor as the excitation source to generate high-frequency unidirectional oscillations that induce acoustic streaming for propulsion. The drive mechanism, powered by miniature ERM vibration motors, eliminates the need for complex mechanical drive systems, enabling complete isolation of the entire drive system from the external environment and facilitating the miniaturization of the robotic fish. A compact, untethered robotic fish, measuring 85 × 60 × 45 mm³, is equipped with three bio-inspired fins located at the pectoral and caudal positions. Experimental results demonstrate that the robotic fish achieves a maximum forward swimming speed of 1.36 body lengths (BL) per second when powered by all fins and a minimum turning radius of 0.6 BL when powered by a single fin. These results underscore the significance of employing the ERM vibration motor in advancing the development of highly maneuverable, miniature untethered underwater robots for various marine exploration tasks.
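To put the body-length figures above into absolute units, the sketch below converts them to millimetres. It assumes that one body length equals the robot's longest quoted dimension, 85 mm; the abstract does not state which dimension defines BL, so treat the outputs as estimates.

```python
# Convert body-length (BL) figures to millimetres.
# Assumption: 1 BL = 85 mm, the longest dimension quoted in the abstract.
BODY_LENGTH_MM = 85.0
MAX_SPEED_BL_PER_S = 1.36
MIN_TURN_RADIUS_BL = 0.6


def bl_to_mm(value_in_bl, body_length_mm=BODY_LENGTH_MM):
    """Scale a quantity expressed in body lengths to millimetres."""
    return value_in_bl * body_length_mm


speed_mm_per_s = bl_to_mm(MAX_SPEED_BL_PER_S)
turn_radius_mm = bl_to_mm(MIN_TURN_RADIUS_BL)
print(f"max speed: {speed_mm_per_s:.1f} mm/s")
print(f"min turning radius: {turn_radius_mm:.1f} mm")
```

Under that assumption, the reported values correspond to roughly 116 mm/s forward speed and a turning radius of about 51 mm.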


Underwater robot discovers a never-before-seen creature at the junction of three tectonic plates in the Pacific Ocean - as baffled viewers dub it the 'forbidden toilet scrubber'

Daily Mail - Science & tech

At first glance at this creature, you'd be forgiven for mistaking it for a sparkly pair of fake eyelashes. But the creature is very much real and was discovered at the junction of three tectonic plates in the Pacific Ocean. Researchers from the Schmidt Ocean Institute spotted the animal while using an underwater robot to scour the seabed. The animal is a polychaete - a class of marine worms, more widely known as bristle worms. 'To describe this polychaete, one simply must use jazz hands -- it is the only way to capture this deep-sea worm's dazzle!' the experts said in an Instagram post about the polychaete.


Enhancing Depth Image Estimation for Underwater Robots by Combining Image Processing and Machine Learning

Nguyen, Quang Truong, Canh, Thanh Nguyen, HoangVan, Xiem

arXiv.org Artificial Intelligence

Depth information plays a crucial role in autonomous systems for environmental perception and robot state estimation. With the rapid development of deep neural network technology, depth estimation has been extensively studied and shown potential for practical applications. However, in particularly challenging environments such as low-light and noisy underwater conditions, direct application of machine learning models may not yield the desired results. Therefore, in this paper, we present an approach to enhance underwater image quality to improve depth estimation effectiveness. First, underwater images are processed through methods such as color compensation, brightness equalization, and enhancement of contrast and sharpness of objects in the image. Next, we perform depth estimation using the Udepth model on the enhanced images. Finally, the results are evaluated and presented to verify the effectiveness and accuracy of the enhanced depth image quality approach for underwater robots.